Goto

Collaborating Authors

 Cache County


Towards a future space-based, highly scalable AI infrastructure system design

Arcas, Blaise Agüera y, Beals, Travis, Biggs, Maria, Bloom, Jessica V., Fischbacher, Thomas, Gromov, Konstantin, Köster, Urs, Pravahan, Rishiraj, Manyika, James

arXiv.org Artificial Intelligence

If AI is a foundational general-purpose technology, we should anticipate that demand for AI compute -- and energy -- will continue to grow. The Sun is by far the largest energy source in our solar system, and thus it warrants consideration how future AI infrastructure could most efficiently tap into that power. This work explores a scalable compute system for machine learning in space, using fleets of satellites equipped with solar arrays, inter-satellite links using free-space optics, and Google tensor processing unit (TPU) accelerator chips. To facilitate high-bandwidth, low-latency inter-satellite communication, the satellites would be flown in close proximity. We illustrate the basic approach to formation flight via a 81-satellite cluster of 1 km radius, and describe an approach for using high-precision ML-based models to control large-scale constellations. Trillium TPUs are radiation tested. They survive a total ionizing dose equivalent to a 5 year mission life without permanent failures, and are characterized for bit-flip errors. Launch costs are a critical part of overall system cost; a learning curve analysis suggests launch to low-Earth orbit (LEO) may reach $\lesssim$\$200/kg by the mid-2030s.


The Generalized Proximity Forest

Shaw, Ben, Rustad, Adam, Maia, Sofia Pelagalli, Rhodes, Jake S., Moon, Kevin R.

arXiv.org Machine Learning

Abstract--Recent work has demonstrated the utility of Random Forest (RF) proximities for various supervised machine learning tasks, including outlier detection, missing data imputation, and visualization. However, the utility of the RF proximities depends upon the success of the RF model, which itself is not the ideal model in all contexts. RF proximities have recently been extended to time series by means of the distance-based Proximity Forest (PF) model, among others, affording time series analysis with the benefits of RF proximities. In this work, we introduce the generalized PF model, thereby extending RF proximities to all contexts in which supervised distance-based machine learning can occur . Additionally, we introduce a variant of the PF model for regression tasks. We also introduce the notion of using the generalized PF model as a meta-learning framework, extending supervised imputation capability to any pre-trained classifier . We experimentally demonstrate the unique advantages of the generalized PF model compared with both the RF model and the k-nearest neighbors model.



Global Cross-Time Attention Fusion for Enhanced Solar Flare Prediction from Multivariate Time Series

Vural, Onur, Hamdi, Shah Muhammad, Boubrahimi, Soukaina Filali

arXiv.org Artificial Intelligence

Multivariate time series classification is increasingly investigated in space weather research as a means to predict intense solar flare events, which can cause widespread disruptions across modern technological systems. Magnetic field measurements of solar active regions are converted into structured multivariate time series, enabling predictive modeling across segmented observation windows. However, the inherently imbalanced nature of solar flare occurrences, where intense flares are rare compared to minor flare events, presents a significant barrier to effective learning. To address this challenge, we propose a novel Global Cross-Time Attention Fusion (GCTAF) architecture, a transformer-based model to enhance long-range temporal modeling. Unlike traditional self-attention mechanisms that rely solely on local interactions within time series, GCTAF injects a set of learnable cross-attentive global tokens that summarize salient temporal patterns across the entire sequence. These tokens are refined through cross-attention with the input sequence and fused back into the temporal representation, enabling the model to identify globally significant, non-contiguous time points that are critical for flare prediction. This mechanism functions as a dynamic attention-driven temporal summarizer that augments the model's capacity to capture discriminative flare-related dynamics. We evaluate our approach on the benchmark solar flare dataset and show that GCTAF effectively detects intense flares and improves predictive performance, demonstrating that refining transformer-based architectures presents a high-potential alternative for solar flare prediction tasks.


TAWRMAC: A Novel Dynamic Graph Representation Learning Method

Farokhi, Soheila, Qi, Xiaojun, Karimi, Hamid

arXiv.org Artificial Intelligence

Dynamic graph representation learning has become essential for analyzing evolving networks in domains such as social network analysis, recommendation systems, and traffic analysis. However, existing continuous-time methods face three key challenges: (1) some methods depend solely on node-specific memory without effectively incorporating information from neighboring nodes, resulting in embedding staleness; (2) most fail to explicitly capture correlations between node neighborhoods, limiting contextual awareness; and (3) many fail to fully capture the structural dynamics of evolving graphs, especially in absence of rich link attributes. To address these limitations, we introduce TAWRMAC-a novel framework that integrates Temporal Anonymous Walks with Restart, Memory Augmentation, and Neighbor Co-occurrence embedding. TAWRMAC enhances embedding stability through a memory-augmented GNN with fixedtime encoding and improves contextual representation by explicitly capturing neighbor correlations. Additionally, its Temporal Anonymous Walks with Restart mechanism distinguishes between nodes exhibiting repetitive interactions and those forming new connections beyond their immediate neighborhood. This approach captures structural dynamics better and supports strong inductive learning. Extensive experiments on multiple benchmark datasets demonstrate that TAWRMAC consistently outperforms state-of-the-art methods in dynamic link prediction and node classification under both transductive and inductive settings across three different negative sampling strategies. By providing stable, generalizable, and context-aware embeddings, TAWRMAC advances the state of the art in continuous-time dynamic graph learning. The code is available at https://anonymous.4open.science/r/tawrmac-A253 .



Mathematical Theory of Collinearity Effects on Machine Learning Variable Importance Measures

Bladen, Kelvyn K., Cutler, D. Richard, Wisler, Alan

arXiv.org Machine Learning

In many machine learning problems, understanding variable importance is a central concern. Two common approaches are Permute-and-Predict (PaP), which randomly permutes a feature in a validation set, and Leave-One-Covariate-Out (LOCO), which retrains models after permuting a training feature. Both methods deem a variable important if predictions with the original data substantially outperform those with permutations. In linear regression, empirical studies have linked PaP to regression coefficients and LOCO to $t$-statistics, but a formal theory has been lacking. We derive closed-form expressions for both measures, expressed using square-root transformations. PaP is shown to be proportional to the coefficient and predictor variability: $\text{PaP}_i = β_i \sqrt{2\operatorname{Var}(\mathbf{x}^v_i)}$, while LOCO is proportional to the coefficient but dampened by collinearity (captured by $Δ$): $\text{LOCO}_i = β_i (1 -Δ)\sqrt{1 + c}$. These derivations explain why PaP is largely unaffected by multicollinearity, whereas LOCO is highly sensitive to it. Monte Carlo simulations confirm these findings across varying levels of collinearity. Although derived for linear regression, we also show that these results provide reasonable approximations for models like Random Forests. Overall, this work establishes a theoretical basis for two widely used importance measures, helping analysts understand how they are affected by the true coefficients, dimension, and covariance structure. This work bridges empirical evidence and theory, enhancing the interpretability and application of variable importance measures.


Label-Guided Imputation via Forest-Based Proximities for Improved Time Series Classification

Rhodes, Jake S., Rustad, Adam G., Maia, Sofia Pelagalli, Thacker, Evan, Choi, Hyunmi, Gutierrez, Jose, Rundek, Tatjana, Shaw, Ben

arXiv.org Machine Learning

Missing data is a common problem in time series data. Most methods for imputation ignore label information pertaining to the time series even if that information exists. In this paper, we provide a framework for missing data imputation in the context of time series classification, where each time series is associated with a categorical label. We define a means of imputing missing values conditional upon labels, the method being guided by powerful, existing supervised models designed for high accuracy in this task. From each model, we extract a tree-based proximity measure from which imputation can be applied. We show that imputation using this method generally provides richer information leading to higher classification accuracies, despite the imputed values differing from the true values.


TIMED: Adversarial and Autoregressive Refinement of Diffusion-Based Time Series Generation

EskandariNasab, MohammadReza, Hamdi, Shah Muhammad, Boubrahimi, Soukaina Filali

arXiv.org Artificial Intelligence

Generating high-quality synthetic time series is a fundamental yet challenging task across domains such as forecasting and anomaly detection, where real data can be scarce, noisy, or costly to collect. Unlike static data generation, synthesizing time series requires modeling both the marginal distribution of observations and the conditional temporal dependencies that govern sequential dynamics. We propose TIMED, a unified generative framework that integrates a denoising diffusion probabilistic model (DDPM) to capture global structure via a forward-reverse diffusion process, a supervisor network trained with teacher forcing to learn autoregressive dependencies through next-step prediction, and a Wasserstein critic that provides adversarial feedback to ensure temporal smoothness and fidelity. To further align the real and synthetic distributions in feature space, TIMED incorporates a Maximum Mean Discrepancy (MMD) loss, promoting both diversity and sample quality. All components are built using masked attention architectures optimized for sequence modeling and are trained jointly to effectively capture both unconditional and conditional aspects of time series data. Experimental results across diverse multivariate time series benchmarks demonstrate that TIMED generates more realistic and temporally coherent sequences than state-of-the-art generative models.


BiasMap: Leveraging Cross-Attentions to Discover and Mitigate Hidden Social Biases in Text-to-Image Generation

Chakraborty, Rajatsubhra, Che, Xujun, Xu, Depeng, Faklaris, Cori, Niu, Xi, Yuan, Shuhan

arXiv.org Artificial Intelligence

Bias discovery is critical for black-box generative models, especiall text-to-image (TTI) models. Existing works predominantly focus on output-level demographic distributions, which do not necessarily guarantee concept representations to be disentangled post-mitigation. We propose BiasMap, a model-agnostic framework for uncovering latent concept-level representational biases in stable diffusion models. BiasMap leverages cross-attention attribution maps to reveal structural entanglements between demographics (e.g., gender, race) and semantics (e.g., professions), going deeper into representational bias during the image generation. Using attribution maps of these concepts, we quantify the spatial demographics-semantics concept entanglement via Intersection over Union (IoU), offering a lens into bias that remains hidden in existing fairness discovery approaches. In addition, we further utilize BiasMap for bias mitigation through energy-guided diffusion sampling that directly modifies latent noise space and minimizes the expected SoftIoU during the denoising process. Our findings show that existing fairness interventions may reduce the output distributional gap but often fail to disentangle concept-level coupling, whereas our mitigation method can mitigate concept entanglement in image generation while complementing distributional bias mitigation.